Abstract - The study investigates the applicability of linear
regression and ANN models for estimating weekly reference
evapotranspiration (ET0) at the Tirupati, Nellore, Rajahmundry,
Anakapalli and Rajendranagar regions of Andhra Pradesh.
The climatic parameters influencing ET0 were identified
through multiple and partial correlation analysis. Sunshine,
temperature, wind velocity and relative humidity were the
parameters that most influenced weekly ET0 estimation in the
study area. Linear regression models in terms of these climatic
parameters, and optimal neural network architectures taking
the same parameters as inputs, were developed. The models'
performance was evaluated with respect to ET0 estimated by the
FAO-56 Penman-Monteith method. The linear regression models
showed a satisfactory performance in weekly ET0 estimation in
the regions selected for the present study. The ANN (4,4,1)
models, however, consistently showed a slightly improved
performance over the linear regression models.

Index Terms — Reference evapotranspiration, multiple linear 
regression, artificial neural network, performance evaluation 

I. INTRODUCTION 

An accurate estimation of reference crop
evapotranspiration (ET0) is of paramount importance for
designing irrigation systems and managing natural water
resources. Numerous ET0 equations have been developed
and used according to the availability of historical and current
weather data. These equations range in sophistication from
simple empirical equations to complex ones. The FAO-56
Penman-Monteith (PM) equation [1] is widely used in recent
times for ET0 estimation. However, the difficulty in using this
equation, in general, is the lack of accurate and complete data.
In addition, the parameters in the equation potentially introduce
a certain amount of measurement and/or computational error,
resulting in cumulative errors in ET0 estimates. Under these
conditions, a simple empirical equation that requires as few
parameters as possible and yields results comparable with the
Penman-Monteith method is preferable. Owing to the difficulties
associated with model structure identification and parameter
estimation of the nonlinear, complex evapotranspiration
process, most of the models that have been developed may
not yield satisfactory results. ANNs are capable of modelling
complex nonlinear processes, effectively extracting the relation
between the inputs and outputs of a process without the
physics being explicitly provided to them, and they can
identify the underlying rule even if the data are noisy and
contaminated with errors [2], [3].

Reference [4]
investigated the utility of ANNs for the estimation of daily
ET0 and compared the performance of ANNs with the PM method.
It was concluded that ANNs can predict ET0 better than the
conventional methods. Reference [5] examined the potential
of artificial neural networks in estimating the actual
evapotranspiration from limited climatic data and suggested
that the crop evapotranspiration could be computed from air
temperature using the ANN approach. Reference [6] showed
that ANNs can be used for forecasting ET0 with high reliability.
Reference [7] derived solar radiation and net radiation
based ET0 equations using the multi-linear regression technique
and concluded that the equations performed better than the
simplified temperature and/or radiation based methods for
humid climates. Reference [8] tested ANNs for estimating
ET0 as a function of maximum and minimum air temperatures
and concluded that it is possible to estimate ET0 taking into
account just the maximum and minimum air temperatures.
Reference [9] tested ANNs to estimate ET0 as a function of
the maximum and minimum air temperatures in a semiarid
climate. On comparison with the PM method, it was concluded
that ANN methods are better for ET0 estimates than the
conventional methods. Reference [10] evaluated ANN models
for daily ET0 estimation in situations where only temperature
and relative humidity data are available. ANNs showed an
improved performance over traditional ET0 equations.
Reference [11] developed generalized artificial neural network
(GANN) based reference crop evapotranspiration models
corresponding to the FAO-56 PM, FAO-24 Radiation, Turc and
FAO-24 Blaney-Criddle methods using the data from California
Irrigation Management Information System stations. It was
concluded that the GANN models can be used directly to
predict ET0 under arid conditions since they performed better
than the conventional ET0 estimation methods. Reference [12]
compared weekly evapotranspiration ANN based forecasts
with a model based on weekly averages and found an improved
performance of one-week-ahead weekly ET0 predictions
compared to the model based on means (mean year model).

II. MATERIALS AND METHODS

The climatic data at the Tirupati, Nellore, Rajahmundry,
Anakapalli and Rajendranagar meteorological centers,
collected from the India Meteorological Department (IMD),
Pune, India, were used in the data analysis and model
development. A part of the data was used for developing the
models (calibration) and the rest for validating the models
developed. While dividing the data into training and testing
sets, it was ensured that the calibration and validation data
sets resembled each other in statistical structure, in terms of
mean, variance and skewness. A brief description of the
meteorological centers along with the data period is shown
in TABLE I. In the present study, an attempt is made to
develop simple linear and optimal neural network models for
weekly ET0 estimation, considering the climatic parameters
influencing the regions selected for the study. The study also
compares the performance of the proposed linear regression
and ANN models.
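
The check on the statistical structure of the two data sets can
be sketched as below. This is only an illustration under stated
assumptions: the function names and the sample values are
hypothetical, population (not sample) moments are used, and the
study does not say how close the statistics had to be.

```python
import statistics


def skewness(values):
    """Population skewness: third central moment over the 1.5 power of the second."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def summarize(values):
    """Mean, population variance and skewness of one data set."""
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "skewness": skewness(values),
    }


def compare_splits(calibration, validation):
    """Absolute difference of each statistic between the two sets."""
    cal, val = summarize(calibration), summarize(validation)
    return {name: abs(cal[name] - val[name]) for name in cal}


# Hypothetical weekly ET0 values (mm/day), for illustration only.
calibration = [3.1, 3.8, 4.6, 5.9, 6.4, 5.2, 4.1, 3.3]
validation = [3.0, 3.9, 4.8, 5.7, 6.5, 5.0, 4.2, 3.4]

for name, gap in compare_splits(calibration, validation).items():
    print(f"{name}: difference = {gap:.3f}")
```

Small differences in all three statistics suggest that the two
sets are comparable; what counts as small is left to the analyst.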

III. MODEL DEVELOPMENT

The weekly reference evapotranspiration model at a 
meteorological center is developed using the climatic data at 
the center. The steps in the modelling include i) identification 
of meteorological parameters influencing the region, ii) 
development of the model and iii) performance evaluation of 
the model developed. The identification of meteorological 
parameters influencing the region is based on multiple and 
partial correlation analysis. The linear regression and ANN 
models are developed for the present study. The performance 
of the models is verified through selected performance 
evaluation criteria. 
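
The parameter identification step can be illustrated for the
simplest case, a first-order partial correlation between ET0
and one climatic parameter with a second parameter held
fixed. The sketch below is an assumption-laden example: the
function names and the sample values are hypothetical, and the
study's own analysis involved more parameters than the single
control variable handled here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(y, x, z):
    """First-order partial correlation of y and x with z held fixed."""
    r_yx, r_yz, r_xz = pearson(y, x), pearson(y, z), pearson(x, z)
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))


# Hypothetical weekly values, for illustration only.
et0 = [3.2, 4.0, 4.9, 5.8, 6.4, 4.3]               # mm/day
sunshine = [5.5, 8.0, 7.2, 9.1, 9.8, 6.6]          # hours
temperature = [26.0, 25.5, 31.0, 32.0, 34.5, 29.0]  # deg C

print("r(ET0, sunshine):", round(pearson(et0, sunshine), 3))
print("r(ET0, sunshine | temperature):",
      round(partial_correlation(et0, sunshine, temperature), 3))
```

A parameter whose partial correlation with ET0 stays high after
the others are controlled for is a candidate model input.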

A. Linear Regression (LR) model 

The objective of the model is the transfer of information 
among several variables observed simultaneously and the 
estimation of the dependent variable from the several other 
observed independent variables. 



The weekly reference evapotranspiration (ET0) at a
meteorological center is expressed as a simple linear model
as

\[ ET_0 = C + a_1 X_1 + a_2 X_2 + \cdots \tag{1} \]

where $a_1, a_2, \ldots$ and $C$ are empirical constants and
$X_1, X_2, \ldots$ are the meteorological parameters influencing
the region. The multiple correlation analysis was carried out
using the STATISTICA package.
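
As a minimal sketch of how the constants in (1) could be
estimated, the example below fits them by ordinary least
squares through the normal equations. This is an assumption
made for illustration, not a description of the STATISTICA
procedure used in the study; the function names, the two
predictors and the synthetic values are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_model(rows, targets):
    """Least-squares fit of target = C + a1*X1 + a2*X2 + ...; returns [C, a1, a2, ...]."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (D^T D) coef = D^T y
    dtd = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    dty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(dtd, dty)


def predict(coefficients, row):
    """Evaluate the fitted linear model for one set of predictor values."""
    return coefficients[0] + sum(a * x for a, x in zip(coefficients[1:], row))


# Synthetic example: (sunshine hours, temperature) rows and ET0 values
# generated from made-up constants, so the fit should recover them.
rows = [(6.0, 25.0), (8.0, 30.0), (9.0, 28.0), (5.0, 24.0), (7.0, 33.0)]
targets = [0.5 + 0.3 * s + 0.1 * t for s, t in rows]

coefficients = fit_linear_model(rows, targets)
print("C, a1, a2 =", [round(c, 4) for c in coefficients])
```

With real data the fit is not exact, and the residuals feed the
performance criteria of the next section.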

B. Artificial Neural Network model (ANN) 

A standard multilayer feed-forward ANN with logistic
sigmoid function was adopted for the present study. A
constant value of 0.1 for the learning rate and a constant value
of 0.9 for the momentum factor were considered. The data were
normalized in the range of (0.1, 0.9) to avoid any saturation
effect. Error back propagation, which is an iterative nonlinear
optimization approach based on the gradient descent search
method [13], was used during calibration. The calibration set
was used to minimize the error and the validation set was used
to ensure proper training of the neural network, such that it
does not get overtrained. The performance of the model
was checked for its improvement on each iteration to avoid
overlearning. The optimal network corresponding to minimum
mean squared error was obtained through a trial and error
process. In determining the optimal number of hidden layers
and of nodes in each hidden layer, care was taken to avoid
both too few neurons, which can cause difficulties in mapping
each input and output in the training set, and too many
neurons, which can increase training time unnecessarily. The
process was carried out using MATLAB routines.
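
The training scheme described above can be sketched in a few
lines of plain Python. This is not the MATLAB code used in the
study. The learning rate (0.1), momentum factor (0.9),
normalization range and 4-4-1 layout follow the text, while the
weight initialization, pattern-by-pattern updates, epoch count
and synthetic data are assumptions made for illustration.

```python
import copy
import math
import random


def normalize(values, low=0.1, high=0.9):
    """Linearly rescale values into [low, high], the range quoted in the text."""
    vmin, vmax = min(values), max(values)
    return [low + (high - low) * (v - vmin) / (vmax - vmin) for v in values]


def sigmoid(s):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-s))


class FeedForwardNet:
    """One hidden layer of sigmoid nodes feeding a single sigmoid output."""

    def __init__(self, n_in=4, n_hidden=4, seed=0):
        rng = random.Random(seed)
        # The last weight in every row is the bias weight.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.v_hidden = [[0.0] * (n_in + 1) for _ in range(n_hidden)]
        self.v_out = [0.0] * (n_hidden + 1)

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [sigmoid(sum(w * v for w, v in zip(row, xb)))
              for row in self.w_hidden] + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(self.w_out, hb)))
        return xb, hb, out

    def predict(self, x):
        return self.forward(x)[2]

    def train_step(self, x, target, rate=0.1, momentum=0.9):
        """Back-propagate the squared error of one pattern, with momentum."""
        xb, hb, out = self.forward(x)
        delta_out = (out - target) * out * (1.0 - out)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * self.w_out[j] * hb[j] * (1.0 - hb[j])
                        for j in range(len(self.w_hidden))]
        for j in range(len(self.w_out)):
            self.v_out[j] = momentum * self.v_out[j] - rate * delta_out * hb[j]
            self.w_out[j] += self.v_out[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                self.v_hidden[j][i] = (momentum * self.v_hidden[j][i]
                                       - rate * delta_hidden[j] * xb[i])
                row[i] += self.v_hidden[j][i]


def mse(net, patterns):
    """Mean squared error of the network over (inputs, target) pairs."""
    return sum((net.predict(x) - t) ** 2 for x, t in patterns) / len(patterns)


# Synthetic, already-normalized patterns (hypothetical, not the study's data).
rng = random.Random(1)
patterns = []
for _ in range(60):
    x = [rng.uniform(0.1, 0.9) for _ in range(4)]
    target = 0.4 * x[0] + 0.3 * x[1] + 0.2 * x[2] + 0.1 * (1.0 - x[3])
    patterns.append((x, target))
calibration, validation = patterns[:40], patterns[40:]

net = FeedForwardNet()
initial_mse = mse(net, validation)
best_mse, best_net = initial_mse, copy.deepcopy(net)
for epoch in range(200):
    for x, target in calibration:
        net.train_step(x, target)
    current = mse(net, validation)
    if current < best_mse:  # keep the weights that generalize best so far
        best_mse, best_net = current, copy.deepcopy(net)

print(f"validation MSE: initial {initial_mse:.5f}, best {best_mse:.5f}")
```

Keeping the weights with the lowest validation error is one
common way to guard against overtraining; repeating the run with
different numbers of hidden nodes mirrors the trial and error
search for the optimal architecture.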

IV. PERFORMANCE EVALUATION CRITERIA

The performance evaluation criteria used in the present
study are the coefficient of determination (R²), root mean
square error (RMSE) and efficiency coefficient (EC).

TABLE I
BRIEF DESCRIPTION OF METEOROLOGICAL CENTERS

A. Coefficient of Determination (R²)

It is the square of the correlation coefficient (R) and the 
correlation coefficient is expressed as 

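\[ R = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2 \, \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}} \times 100 \tag{2} \]

where $y_i$ and $\hat{y}_i$ are the observed and estimated values
respectively, $\bar{y}$ and $\bar{\hat{y}}$ are the means of the
observed and estimated values, and $n$ is the number of
observations. It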
measures the degree of association between the observed 
and estimated values and indicates the relative assessment 
of the model performance in dimensionless measure. 
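
A short sketch of this criterion is given below. The function
names and sample values are hypothetical; R is returned as a
fraction, so it would be multiplied by 100 to express it as a
percentage.

```python
import math


def correlation_coefficient(observed, estimated):
    """Correlation coefficient R between observed and estimated values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_est = sum(estimated) / n
    cross = sum((y - mean_obs) * (e - mean_est)
                for y, e in zip(observed, estimated))
    ss_obs = sum((y - mean_obs) ** 2 for y in observed)
    ss_est = sum((e - mean_est) ** 2 for e in estimated)
    return cross / math.sqrt(ss_obs * ss_est)


def coefficient_of_determination(observed, estimated):
    """R squared, the square of the correlation coefficient."""
    return correlation_coefficient(observed, estimated) ** 2


# Hypothetical weekly ET0 values (mm/day): PM reference vs. a model.
pm_values = [3.2, 4.1, 5.0, 5.8, 6.3, 4.7]
model_values = [3.4, 4.0, 4.8, 5.9, 6.1, 4.9]
print("R^2 =", round(coefficient_of_determination(pm_values, model_values), 4))
```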

B. Root Mean Square Error (RMSE)

It yields the residual error in terms of the mean square error
and is expressed as [14]

\[ RMSE = \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \tag{3} \]
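
Expression (3) translates directly into code. The function name
and the sample values below are hypothetical.

```python
import math


def rmse(observed, estimated):
    """Root mean square error between observed and estimated values."""
    n = len(observed)
    squared = sum((y - e) ** 2 for y, e in zip(observed, estimated))
    return math.sqrt(squared / n)


# Hypothetical weekly ET0 values (mm/day): PM reference vs. a model.
pm_values = [3.2, 4.1, 5.0, 5.8, 6.3, 4.7]
model_values = [3.4, 4.0, 4.8, 5.9, 6.1, 4.9]
print("RMSE =", round(rmse(pm_values, model_values), 4), "mm/day")
```

The result carries the units of the data, unlike R² and EC.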



C. Efficiency Coefficient (EC) 

It is used to assess the performance of different models [15]. 
It is a better choice than RMSE statistic when the calibration 
and verification periods have different lengths [16]. It 
measures directly the ability of the model to reproduce the 
observed values and is expressed as 



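\[ EC = \left(1 - \frac{F}{F_0}\right) \times 100 \tag{4} \]

where $F_0 = \sum_{i=1}^{n} (y_i - \bar{y})^2$ and
$F = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$.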

A value of EC of 90% generally indicates a very satisfactory 
model performance while a value in the range 80-90%, a fairly 
good model. Values of EC in the range 60-80% would indicate 
an unsatisfactory model fit. 
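
A minimal sketch of this criterion follows, assuming the usual
form of the efficiency coefficient: one minus the ratio of the
sum of squared differences between observed and estimated
values to the sum of squared deviations of the observed values
from their mean, expressed as a percentage. The function name
and sample values are hypothetical.

```python
def efficiency_coefficient(observed, estimated):
    """Efficiency coefficient EC in percent."""
    mean_obs = sum(observed) / len(observed)
    # Initial variance: spread of the observations about their own mean.
    initial = sum((y - mean_obs) ** 2 for y in observed)
    # Residual variance: what the model leaves unexplained.
    residual = sum((y - e) ** 2 for y, e in zip(observed, estimated))
    return (1.0 - residual / initial) * 100.0


# Hypothetical weekly ET0 values (mm/day): PM reference vs. a model.
pm_values = [3.2, 4.1, 5.0, 5.8, 6.3, 4.7]
model_values = [3.4, 4.0, 4.8, 5.9, 6.1, 4.9]
print("EC =", round(efficiency_coefficient(pm_values, model_values), 2), "%")
```

A model that only reproduces the mean of the observations scores
0%, and a perfect model scores 100%.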



V. RESULTS AND DISCUSSION

Multiple correlation analysis was carried out to identify
the climatic parameters influencing weekly ET0 in the
regions selected for the study. It may be observed from the
multiple and partial correlation coefficients presented in
TABLE II that sunshine hours, temperature, wind velocity
and relative humidity are the parameters that most influence
weekly ET0 estimation in the regions. Linear weekly ET0
regression models at the centers have been developed as
presented in TABLE III. The weekly ET0 estimation at the
meteorological centers has also been carried out using
different artificial neural network architectures, with input
nodes ranging from one to four and a varying number of nodes
in the hidden layer. The ANNs with four input nodes and a
hidden layer with four nodes, i.e. ANN (4,4,1), have been
identified as optimal architectures. The performance indices
of the linear regression (LR) and ANN (4,4,1) models, on
comparison of the results with those of the FAO-56
Penman-Monteith method, are presented in TABLE IV. It may
be observed from the results presented in TABLE IV that the
values of R² and EC of the LR models indicate a very
satisfactory performance. However, the performance has
improved marginally with the optimal artificial neural network
architectures.









TABLE II
MULTIPLE AND PARTIAL CORRELATION COEFFICIENTS

TABLE III
LINEAR REGRESSION MODELS

TABLE IV
PERFORMANCE INDICES OF LR AND ANN (4,4,1) MODELS

The values of RMSE of the ANN (4,4,1) models have also reduced
slightly. That the improvement is only marginal may be due to
the fact that weekly average ET0 values do not exhibit much
nonlinearity. Scatter plots (not shown in the paper) were drawn
of the ET0 values estimated using the Penman-Monteith method
against those estimated using the LR and ANN (4,4,1) models
respectively. The nearly unit slope and zero intercept of the
scatter plots (TABLE IV) indicate the closeness of the ET0
values to those of the PM method. Fig. 1 presents the
comparison of the performance of the LR and ANN (4,4,1)
models against the PM method during the testing period. The
study reveals that the simple linear regression models
proposed may be adopted satisfactorily for weekly ET0
estimation at the centers selected for the present study, and
that the accuracy of the ET0 estimation may be further
improved using ANN (4,4,1) models.
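
The slope and intercept of such a scatter plot can be obtained
from a straight-line fit of the model estimates against the PM
values. The sketch below assumes an ordinary least-squares
line; the function name and the sample values are hypothetical.

```python
def fit_line(x, y):
    """Least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical weekly ET0 values (mm/day): PM reference vs. a model.
pm_values = [3.2, 4.1, 5.0, 5.8, 6.3, 4.7]
model_values = [3.4, 4.0, 4.8, 5.9, 6.1, 4.9]
slope, intercept = fit_line(pm_values, model_values)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A slope near one together with an intercept near zero means the
model estimates track the PM values without systematic bias.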



VI. CONCLUSIONS 

Climatic parameters such as sunshine hours,
temperature, wind velocity and relative humidity mostly
influenced weekly ET0 estimation at the Tirupati, Nellore,
Rajahmundry, Anakapalli and Rajendranagar regions of
Andhra Pradesh. The linear regression models proposed in
terms of the climatic parameters influencing the regions
performed satisfactorily in weekly ET0 estimation. The
optimal ANN models proposed showed a marginal
improvement over the linear regression models. The linear
regression models may therefore be adopted for weekly ET0
estimation in the regions with a reasonable degree of
accuracy, and the accuracy may be slightly improved with ANN
architectures.

Figure 1. Comparison of average weekly ET0 values estimated using LR and ANN models with those estimated by the Penman-Monteith method during the testing period